Gene Ontology Similarity Measures Based on Linear Order Statistics
Authors: Keller et al.
Abstract
The standard method for comparing gene products (proteins or RNA) is to compare their DNA or amino acid sequences. Additional information about some gene products may come from multiple sources, including the set of Gene Ontology (GO) annotations and the set of journal abstracts related to each gene product. Gene product similarity measures can be based on evaluating sets of descriptor terms found in the GO taxonomy, and/or the index term sets of the related documents (MeSH annotations). While our techniques can be applied to term sets from any taxonomy, we restrict our examples in this article to GO annotations. We investigate the use of linear order statistics (LOS) to build similarity relations on pairs of terms that are used in the GO as linguistic descriptors of genes and gene products. One of our objectives is to investigate the construction and utility of visual assessments of relational data (in this case, dissimilarity matrices) for discovering tendencies of groups of gene products to "cluster together". We use gene product data derived from a group of 194 gene products representing three protein families extracted from ENSEMBL. Our examples suggest that LOS similarity measures are more effective than traditional sequence-based similarity measures at capturing relationships between pairs of gene products in ENSEMBL families when annotation information is available. We show examples of how these similarity measures can assist in knowledge discovery and gene product family validation.
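As a rough illustration of the idea named in the abstract, a linear order statistic is a weighted sum of values after they have been sorted, with weights that sum to one; the maximum, minimum, mean, and median are special cases. The sketch below applies this to a small matrix of term-to-term similarities between two annotation sets. The abstract does not say how term-pair similarities are computed or which weights the authors use, so the matrix values, the weight choices, and all function names here are hypothetical.

```python
from typing import Callable, List, Sequence


def linear_order_statistic(values: Sequence[float], weights: Sequence[float]) -> float:
    """Weighted sum of the values sorted in ascending order.

    The weights must be non-negative and sum to 1 (checked with a small tolerance).
    """
    if len(values) != len(weights):
        raise ValueError("values and weights must have the same length")
    if any(w < 0 for w in weights) or abs(sum(weights) - 1.0) > 1e-9:
        raise ValueError("weights must be non-negative and sum to 1")
    ordered = sorted(values)
    return sum(w * x for w, x in zip(weights, ordered))


def max_weights(n: int) -> List[float]:
    """All weight on the largest value: the LOS reduces to the maximum."""
    return [0.0] * (n - 1) + [1.0]


def mean_weights(n: int) -> List[float]:
    """Equal weights: the LOS reduces to the arithmetic mean."""
    return [1.0 / n] * n


def los_set_similarity(
    term_sim: Sequence[Sequence[float]],
    weights_for: Callable[[int], List[float]],
) -> float:
    """Aggregate all term-pair similarities between two term sets with an LOS.

    term_sim[i][j] is an assumed, precomputed similarity between term i of
    the first gene product and term j of the second.
    """
    flat = [s for row in term_sim for s in row]
    return linear_order_statistic(flat, weights_for(len(flat)))


# Hypothetical similarities between two gene products with two GO terms each.
term_sim = [[0.9, 0.2],
            [0.4, 0.6]]

print(los_set_similarity(term_sim, max_weights))
print(los_set_similarity(term_sim, mean_weights))
```

Changing the weight vector moves the measure between optimistic (maximum-like) and averaging behavior without changing the rest of the computation.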
Similar resources
A Topology-Based Metric for Measuring Term Similarity in the Gene Ontology
The wide coverage and biological relevance of the Gene Ontology (GO), confirmed through its successful use in protein function prediction, have led to the growth in its popularity. In order to exploit the extent of biological knowledge that GO offers in describing genes or groups of genes, there is a need for an efficient, scalable similarity measure for GO terms and GO-annotated proteins. Whil...
Information Content-Based Gene Ontology Functional Similarity Measures: Which One to Use for a Given Biological Data Type?
The current increase in Gene Ontology (GO) annotations of proteins in the existing genome databases and their use in different analyses have fostered the improvement of several biomedical and biological applications. To integrate this functional data into different analyses, several protein functional similarity measures based on GO term information content (IC) have been proposed and evaluated...
Information Content-Based Gene Ontology Semantic Similarity Approaches: Toward a Unified Framework Theory
Several approaches have been proposed for computing term information content (IC) and semantic similarity scores within the gene ontology (GO) directed acyclic graph (DAG). These approaches contributed to improving protein analyses at the functional level. Considering the recent proliferation of these approaches, a unified theory in a well-defined mathematical framework is necessary in order to...
An ontological hybrid recommender system for dealing with cold start problem
Recommender Systems (RSs) are expected to suggest accurate goods to consumers. Cold start is the most important challenge for RSs. Recent hybrid RSs combine and . We introduce an ontological hybrid RS where the ontology has been employed in its part while improving the ontology structure by its part. In this paper, a new hybrid approach is proposed based on the combination of demog...
Methods of Normalization the Results of Gene Ontology Term Similarity
The article addresses the issue of improvement of the results quality when Gene Ontology (GO) term similarity is calculated. Several GO similarity measures produce results out of the range [0; 1]. Whereas, in order to compare different similarity measures or apply further processing, it is needed to normalise the results to this range. The most popular and well-known method of normalization is ...
Journal: International Journal of Uncertainty, Fuzziness and Knowledge-Based Systems
Volume: 14
Publication date: 2006